Information bottleneck

Claude Shannon's rate-distortion theory formalizes the trade-off between compression and the preservation of meaning (e.g., how higher-level categories can compress detail while preserving meaning). Namely, given data $X$ and a compressed representation $T$, it seeks encodings that minimize $R + \lambda D$, where $R$ is the rate (number of bits), $D$ is the distortion, and $\lambda \ge 0$ sets the trade-off between the two.
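As a small illustration of this trade-off, one case with a known closed form is a Bernoulli(p) source under Hamming distortion (the fraction of bits reconstructed wrongly), where the minimum rate is R(D) = H_b(p) - H_b(D) for D below min(p, 1 - p) and zero otherwise. The sketch below tabulates that curve; the function names are illustrative and not taken from the text.

```python
import math


def binary_entropy(q: float) -> float:
    """Entropy in bits of a Bernoulli(q) variable."""
    if q <= 0.0 or q >= 1.0:
        return 0.0
    return -q * math.log2(q) - (1 - q) * math.log2(1 - q)


def bernoulli_rate_distortion(p: float, d: float) -> float:
    """Minimum rate (bits per symbol) for a Bernoulli(p) source when an
    average Hamming distortion of d >= 0 is tolerated."""
    if d >= min(p, 1 - p):
        # Guessing the more likely symbol already achieves this distortion.
        return 0.0
    return binary_entropy(p) - binary_entropy(d)


if __name__ == "__main__":
    # Tolerating more distortion lowers the number of bits required.
    for d in (0.0, 0.05, 0.11, 0.25, 0.5):
        rate = bernoulli_rate_distortion(0.5, d)
        print(f"D = {d:.2f}  ->  R(D) = {rate:.3f} bits")
```

For a fair coin (p = 0.5), lossless coding costs 1 bit per symbol, and the required rate falls to zero once a distortion of 0.5 is acceptable.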

Information bottleneck theory then proposes minimizing the following objective when compressing $X$ into $T$ while preserving information about a relevant variable $Y$:

$$\mathcal{L} = I(X;T) - \beta\, I(T;Y)$$
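For discrete variables, the information bottleneck objective, commonly written as I(X;T) - beta * I(T;Y), can be evaluated directly from probability tables. The sketch below is a minimal illustration under that assumption: the toy joint distribution, the three candidate encoders, and the names `mutual_information` and `ib_objective` are all hypothetical choices made for the example, and it relies on T depending on Y only through X.

```python
import numpy as np


def mutual_information(p_joint: np.ndarray) -> float:
    """I(A;B) in bits from a joint probability table p_joint[a, b]."""
    p_a = p_joint.sum(axis=1, keepdims=True)   # marginal of A, shape (n, 1)
    p_b = p_joint.sum(axis=0, keepdims=True)   # marginal of B, shape (1, m)
    mask = p_joint > 0                         # skip zero cells (0 * log 0 = 0)
    ratio = p_joint[mask] / (p_a @ p_b)[mask]
    return float(np.sum(p_joint[mask] * np.log2(ratio)))


def ib_objective(p_xy: np.ndarray, p_t_given_x: np.ndarray, beta: float) -> float:
    """I(X;T) - beta * I(T;Y) for a stochastic encoder p_t_given_x[x, t]."""
    p_x = p_xy.sum(axis=1)
    p_xt = p_x[:, None] * p_t_given_x          # p(x, t) = p(x) p(t|x)
    p_ty = p_t_given_x.T @ p_xy                # p(t, y) = sum_x p(t|x) p(x, y)
    return mutual_information(p_xt) - beta * mutual_information(p_ty)


# Toy data (assumed): four equally likely inputs; the first two map to
# y = 0 and the last two map to y = 1.
p_xy = np.array([[0.25, 0.0],
                 [0.25, 0.0],
                 [0.0, 0.25],
                 [0.0, 0.25]])

encoders = {
    "identity (T = X)": np.eye(4),
    "merged by label": np.array([[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]]),
    "constant": np.ones((4, 1)),
}

if __name__ == "__main__":
    for name, encoder in encoders.items():
        print(name, ib_objective(p_xy, encoder, beta=2.0))
```

With beta = 2, the encoder that merges inputs sharing a label scores lowest (1 bit about X kept, 1 bit about Y preserved, objective -1). The identity encoder spends 2 bits on X for the same 1 bit about Y, and the constant encoder keeps nothing, so both score 0.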